Parallel K-Means Clustering Based on MapReduce
Authors
Abstract
Parallel K-Means Clustering Based on MapReduce. Weizhong Zhao, Huifang Ma, and Qing He. The Key Laboratory of Intelligent Information. K-means clustering is a basic method for analyzing remote sensing (RS) images.
Similar resources
Parallel K-Means Clustering Based on MapReduce
Data clustering has received considerable attention in many applications, such as data mining, document retrieval, image segmentation and pattern classification. The growing volume of information produced by advances in technology makes clustering very large-scale data a challenging task. In order to deal with the problem, many researchers try to design efficient parallel clu...
MapReduce K-Means based Co-Clustering Approach for Web Page Recommendation System
Co-clustering is one of the data mining techniques used for web usage mining. Co-clustering Web log data is the process of simultaneously categorizing both users and pages. It is used to extract users’ information based on a subset of pages. Nowadays, cyberspace is filled with a huge volume of data distributed across the world. The business knowledge acquaintance from such a voluminous d...
Comparing k-means clusters on parallel Persian-English corpus
This paper compares clusters of aligned Persian and English texts obtained with the k-means method. Text clustering has many applications in various fields of natural language processing. So far, much research on English document clustering has been carried out. The question now arises whether those results extend to other languages. Since the goal of document clustering is grouping of docum...
A New Parallelization Method for K-means
K-means is a popular clustering method used in the data mining area. To work with large datasets, researchers proposed PKMeans, a parallel k-means on MapReduce [3]. However, the existing k-means parallelization methods, including PKMeans, have many limitations. PKMeans can’t finish all its iterations in one MapReduce job, so it has to repeat cascading MapReduce jobs in a loop until convergence. On...
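The loop of cascading jobs described above can be sketched as follows. This is a minimal single-process illustration, not the PKMeans implementation or any Hadoop API: the names `map_phase`, `reduce_phase`, and `kmeans_mapreduce`, and the `tol` and `max_jobs` parameters, are hypothetical and chosen only for this example. Each pass through the loop stands in for one MapReduce job.

```python
import math
from collections import defaultdict


def nearest(point, centers):
    """Return the index of the center closest to `point` (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))


def map_phase(points, centers):
    """Map step: emit (cluster index, (point, 1)) for every input point."""
    for p in points:
        yield nearest(p, centers), (p, 1)


def reduce_phase(pairs, centers):
    """Reduce step: sum coordinates and counts per cluster, then average."""
    sums = {}
    counts = defaultdict(int)
    for idx, (p, n) in pairs:
        acc = sums.setdefault(idx, [0.0] * len(p))
        for d, x in enumerate(p):
            acc[d] += x
        counts[idx] += n
    # Assumption: a cluster that received no points keeps its previous center.
    return [
        tuple(s / counts[i] for s in sums[i]) if i in sums else centers[i]
        for i in range(len(centers))
    ]


def kmeans_mapreduce(points, centers, tol=1e-9, max_jobs=100):
    """Run one simulated map/reduce job per iteration until centers stop moving."""
    for job in range(1, max_jobs + 1):
        new_centers = reduce_phase(map_phase(points, centers), centers)
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift <= tol:
            break
    return centers, job


points = [(0, 0), (0, 1), (10, 10), (10, 11)]
centers, jobs = kmeans_mapreduce(points, [(0, 0), (10, 10)])
print(centers, jobs)
```

Because the updated centers are only known after the reduce step, the next assignment of points cannot start until the previous job has finished, which is why the number of jobs equals the number of iterations in this sketch.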
A MapReduce-based parallel K-means clustering for large-scale CIM data verification
The Common Information Model (CIM) has been heavily used in electric power grids for data exchange among a number of auxiliary systems such as communication systems, monitoring systems and marketing systems. With the rapid deployment of digitalized devices in electric power networks, the volume of data grows continuously, which makes verification of CIM data a challenging issue. This paper presen...
Journal title:
Volume, issue
Pages -
Publication date 2015